Probabilistic Graphical Models

Probabilistic Graphical Model

= a joint probability distribution that uses a graph structure to encode conditional independence assumptions.

Graph structure

A Graph (or Network) is a set of entities (called Nodes or Vertices) connected through Edges.

Directed Acyclic Graph (DAG)

= a special kind of graph (see Graph structure above) with:

  1. directed edges (every edge has a direction: from parent → child)
  2. no cycles — you can’t start at a node and follow a path that loops back to it
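The two conditions above can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a list of node names and a list of (parent, child) pairs; the function name `is_dag` is illustrative and not taken from any library. It uses Kahn's algorithm: repeatedly remove nodes with no remaining incoming edges, and if every node can be removed, there is no cycle.

```python
from collections import deque


def is_dag(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains no cycle."""
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Start from nodes that have no parents.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes on a cycle never reach indegree 0, so they are never removed.
    return removed == len(nodes)


fork = is_dag(["X", "Y", "Z"], [("Z", "X"), ("Z", "Y")])
loop = is_dag(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
```

Here `fork` describes an acyclic graph and `loop` a three-node cycle, so only the first passes the check.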

Bayesian Network

= a probabilistic graphical model whose graph is a DAG: each node is a random variable and each directed edge a direct probabilistic dependence between variables (although there is nothing inherently Bayesian about such models)
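In a Bayesian network the joint distribution factorizes into one conditional distribution per node given its parents. The sketch below illustrates this for a three-variable network X ← Z → Y with binary variables. All probability values are made up for illustration, and the names `P_Z1`, `P_X1_GIVEN_Z`, `P_Y1_GIVEN_Z`, `bern`, and `joint` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical conditional probability tables (values are assumptions).
P_Z1 = F(3, 10)                            # P(Z = 1)
P_X1_GIVEN_Z = {0: F(1, 5), 1: F(4, 5)}    # P(X = 1 | Z = z)
P_Y1_GIVEN_Z = {0: F(1, 2), 1: F(9, 10)}   # P(Y = 1 | Z = z)


def bern(p_one, value):
    """Probability of a binary value given the probability of 1."""
    return p_one if value == 1 else 1 - p_one


def joint(x, y, z):
    """P(X=x, Y=y, Z=z) = P(z) * P(x | z) * P(y | z)."""
    return bern(P_Z1, z) * bern(P_X1_GIVEN_Z[z], x) * bern(P_Y1_GIVEN_Z[z], y)


# A valid joint distribution sums to 1 over all assignments.
total = sum(joint(x, y, z) for x, y, z in product((0, 1), repeat=3))
```

Exact fractions are used instead of floats so that the normalization check holds without rounding tolerance.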

Basic Causal Structures

| Confound type | DAG structure (schematic) | Description | Adjustment implications / Take care... | Example |
| --- | --- | --- | --- | --- |
| The Fork | X ← Z → Y | Z is a common cause of X and Y. This creates a spurious association between X and Y if not controlled. | Must adjust for Z to block confounding. | Age (Z) affects both exercise (X) and health outcomes (Y). |
| The Pipe | X → Z → Y | Z is a mediator (on the causal pathway from X to Y). | Adjusting for Z blocks part of the causal effect → post-treatment bias. Don’t condition if interested in total effect. | Smoking (X) → blood pressure (Z) → heart disease (Y). |
| The Collider | X → Z ← Y | Z is a collider (caused by both X and Y). By default, X and Y are independent. | Conditioning on Z opens a spurious path → collider bias. Do not adjust for Z. | Genetics (X) and environment (Y) both affect admission to college (Z). Conditioning on Z creates a false association. |
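Collider bias can be verified by exact enumeration on a toy model. The sketch below assumes X and Y are independent fair coins and that Z = 1 whenever X or Y is 1; this toy mechanism and the helper names `joint`, `prob`, and `cond` are assumptions chosen for illustration. It compares P(X = 1 | Y) with and without conditioning on Z.

```python
from fractions import Fraction as F
from itertools import product


def joint(x, y, z):
    """P(X=x, Y=y, Z=z) for independent fair coins X, Y and Z = X or Y."""
    z_expected = 1 if (x or y) else 0
    return F(1, 4) if z == z_expected else F(0)


def prob(event):
    """Probability of an event given as a predicate on (x, y, z)."""
    return sum(
        joint(x, y, z)
        for x, y, z in product((0, 1), repeat=3)
        if event(x, y, z)
    )


def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda x, y, z: event(x, y, z) and given(x, y, z)) / prob(given)


# Without conditioning on Z, Y carries no information about X.
p_x1_given_y0 = cond(lambda x, y, z: x == 1, lambda x, y, z: y == 0)
p_x1_given_y1 = cond(lambda x, y, z: x == 1, lambda x, y, z: y == 1)

# Conditioning on the collider Z = 1 makes X depend on Y.
p_x1_given_y0_z1 = cond(lambda x, y, z: x == 1, lambda x, y, z: y == 0 and z == 1)
p_x1_given_y1_z1 = cond(lambda x, y, z: x == 1, lambda x, y, z: y == 1 and z == 1)
```

Marginally both conditionals equal 1/2. Within the stratum Z = 1 they differ (1 versus 1/2), because knowing that Y = 0 forces X = 1 to explain Z = 1.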

Identification Strategies

Back-Door Criterion

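= A graphical criterion for choosing a set of variables Z to adjust for in order to estimate the causal effect of X on Y: Z must contain no descendant of X and must block every path between X and Y that starts with an arrow into X. When such a Z exists: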

$$P(Y \mid do(X)) = \sum_{z} P(Y \mid X, Z = z)\, P(Z = z)$$
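To make the back-door adjustment concrete, here is a minimal sketch on a toy model in which Z confounds X and Y (Z → X, Z → Y, X → Y). All probability tables are invented for illustration, and `bern`, `joint`, `prob`, and `backdoor` are hypothetical helper names. The adjusted quantity is computed only from the observational joint distribution and compared with the naive conditional P(Y = 1 | X = 1).

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical model: Z -> X, Z -> Y, X -> Y (all variables binary).
P_Z1 = F(1, 2)
P_X1_GIVEN_Z = {0: F(1, 5), 1: F(4, 5)}
P_Y1_GIVEN_XZ = {(0, 0): F(1, 10), (1, 0): F(3, 10),
                 (0, 1): F(1, 2), (1, 1): F(7, 10)}   # keyed by (x, z)


def bern(p_one, value):
    return p_one if value == 1 else 1 - p_one


def joint(x, y, z):
    """Observational joint P(X=x, Y=y, Z=z)."""
    return (bern(P_Z1, z) * bern(P_X1_GIVEN_Z[z], x)
            * bern(P_Y1_GIVEN_XZ[(x, z)], y))


def prob(event):
    return sum(joint(x, y, z)
               for x, y, z in product((0, 1), repeat=3) if event(x, y, z))


def backdoor(x_val):
    """P(Y=1 | do(X=x_val)) = sum_z P(Y=1 | X=x_val, Z=z) P(Z=z)."""
    total = F(0)
    for z_val in (0, 1):
        p_z = prob(lambda x, y, z: z == z_val)
        p_y_given_xz = (
            prob(lambda x, y, z: y == 1 and x == x_val and z == z_val)
            / prob(lambda x, y, z: x == x_val and z == z_val)
        )
        total += p_y_given_xz * p_z
    return total


naive = prob(lambda x, y, z: y == 1 and x == 1) / prob(lambda x, y, z: x == 1)
adjusted = backdoor(1)
```

In this toy model the naive conditional is 31/50 while the adjusted value is 1/2; the gap is the confounding contributed by Z.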

Front-Door Criterion

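= A graphical criterion that can identify the causal effect of X on Y even with unmeasured confounders, provided we can observe a mediator M that intercepts every directed path from X to Y, with no unblocked back-door path from X to M and every back-door path from M to Y blocked by X.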

$$P(Y \mid do(X)) = \sum_{m} P(M = m \mid X) \sum_{x} P(Y \mid M = m, X = x)\, P(X = x)$$
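The front-door adjustment can be checked on a toy model with a hidden confounder U (U → X, U → Y, X → M, M → Y). The sketch below is illustrative only: the probability tables are made up, and `bern`, `observed`, `prob`, `front_door`, and `truth` are hypothetical helper names. `front_door` uses only the observed joint over (X, M, Y), while `truth` uses the full model including U, which would not be available in practice.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical model: U -> X, U -> Y, X -> M, M -> Y (U is unobserved).
P_U1 = F(1, 2)
P_X1_GIVEN_U = {0: F(1, 5), 1: F(4, 5)}
P_M1_GIVEN_X = {0: F(1, 10), 1: F(9, 10)}
P_Y1_GIVEN_MU = {(0, 0): F(1, 10), (1, 0): F(1, 2),
                 (0, 1): F(2, 5), (1, 1): F(4, 5)}   # keyed by (m, u)


def bern(p_one, value):
    return p_one if value == 1 else 1 - p_one


def observed(x, m, y):
    """Observed joint P(X, M, Y): the hidden confounder U is summed out."""
    return sum(
        bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
        * bern(P_M1_GIVEN_X[x], m) * bern(P_Y1_GIVEN_MU[(m, u)], y)
        for u in (0, 1)
    )


def prob(event):
    return sum(observed(x, m, y)
               for x, m, y in product((0, 1), repeat=3) if event(x, m, y))


def front_door(x_val):
    """sum_m P(m | x_val) * sum_x' P(Y=1 | m, x') P(x'), from observed data."""
    total = F(0)
    for m_val in (0, 1):
        p_m_given_x = (prob(lambda x, m, y: m == m_val and x == x_val)
                       / prob(lambda x, m, y: x == x_val))
        inner = F(0)
        for x_alt in (0, 1):
            p_x = prob(lambda x, m, y: x == x_alt)
            p_y_given_mx = (
                prob(lambda x, m, y: y == 1 and m == m_val and x == x_alt)
                / prob(lambda x, m, y: m == m_val and x == x_alt)
            )
            inner += p_y_given_mx * p_x
        total += p_m_given_x * inner
    return total


def truth(x_val):
    """Ground-truth P(Y=1 | do(X=x_val)) computed with access to U."""
    return sum(
        bern(P_M1_GIVEN_X[x_val], m) * bern(P_U1, u) * P_Y1_GIVEN_MU[(m, u)]
        for m in (0, 1) for u in (0, 1)
    )


naive = prob(lambda x, m, y: y == 1 and x == 1) / prob(lambda x, m, y: x == 1)
```

With these numbers `front_door(1)` matches `truth(1)` exactly even though U never enters the front-door computation, whereas the naive conditional P(Y = 1 | X = 1) does not.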